local convergence
On Local Limits of Sparse Random Graphs: Color Convergence and the Refined Configuration Model
Local convergence has emerged as a fundamental tool for analyzing sparse random graph models. We introduce a new notion of local convergence, color convergence, based on the Weisfeiler-Leman algorithm. Color convergence fully characterizes the class of random graphs that are well-behaved in the limit for message-passing graph neural networks. Building on this, we propose the Refined Configuration Model (RCM), a random graph model that generalizes the configuration model. The RCM is universal with respect to local convergence among locally tree-like random graph models, including Erd os-Rényi, stochastic block and configuration models. Finally, this framework enables a complete characterization of the random trees that arise as local limits of such graphs.
A Regularized Newton Method for Nonconvex Optimization with Global and Local Complexity Guarantees
Finding an $\epsilon$-stationary point of a nonconvex function with a Lipschitz continuous Hessian is a central problem in optimization. Regularized Newton methods are a classical tool and have been studied extensively, yet they still face a trade off between global and local convergence. Whether a parameter-free algorithm of this type can simultaneously achieve optimal global complexity and quadratic local convergence remains an open question. To bridge this long-standing gap, we propose a new class of regularizers constructed from the current and previous gradients, and leverage the conjugate gradient approach with a negative curvature monitor to solve the regularized Newton equation. The proposed algorithm is adaptive, requiring no prior knowledge of the Hessian Lipschitz constant, and achieves a global complexity of $O(\epsilon^{-\frac{3}{2}})$ in terms of the second-order oracle calls, and $\tilde O(\epsilon^{-\frac{7}{4}})$ for Hessian-vector products, respectively. When the iterates converge to a point where the Hessian is positive definite, the method exhibits quadratic local convergence. Preliminary numerical results, including training the physics-informed neural networks, illustrate the competitiveness of our algorithm.
On Local Limits of Sparse Random Graphs: Color Convergence and the Refined Configuration Model
Local convergence has emerged as a fundamental tool for analyzing sparse random graph models. We introduce a new notion of local convergence,, based on the Weisfeiler-Leman algorithm. Color convergence fully characterizes the class of random graphs that are well-behaved in the limit for message-passing graph neural networks. Building on this, we propose the (RCM), a random graph model that generalizes the configuration model. The RCM is universal with respect to local convergence among locally tree-like random graph models, including Erdős-Rényi, stochastic block and configuration models. Finally, this framework enables a complete characterization of the random trees that arise as local limits of such graphs.
On Local Limits of Sparse Random Graphs: Color Convergence and the Refined Configuration Model
Pluska, Alexander, Malhotra, Sagar
Local convergence has emerged as a fundamental tool for analyzing sparse random graph models. We introduce a new notion of local convergence, color convergence, based on the Weisfeiler-Leman algorithm. Color convergence fully characterizes the class of random graphs that are well-behaved in the limit for message-passing graph neural networks. Building on this, we propose the Refined Configuration Model (RCM), a random graph model that generalizes the configuration model. The RCM is universal with respect to local convergence among locally tree-like random graph models, including Erdős-Rényi, stochastic block and configuration models. Finally, this framework enables a complete characterization of the random trees that arise as local limits of such graphs.
Primal-Dual iLQR
Sousa-Pinto, João, Orban, Dominique
We introduce a new algorithm for solving unconstrained discrete-time optimal control problems. Our method follows a direct multiple shooting approach, and consists of applying the SQP method together with an $\ell_2$ augmented Lagrangian primal-dual merit function. We use the LQR algorithm to efficiently solve the primal-dual Newton-KKT system. As our algorithm is a specialization of NPSQP, it inherits its generic properties, including global convergence, fast local convergence, and the lack of need for second order corrections or dimension expansions, improving on existing direct multiple shooting approaches such as acados, ALTRO, GNMS, FATROP, and FDDP. The solutions of the LQR-shaped subproblems posed by our algorithm can be be parallelized to run in time logarithmic in the number of stages, states, and controls. Moreover, as our method avoids sequential rollouts of the nonlinear dynamics, it can run in $O(1)$ parallel time per line search iteration. Therefore, this paper provides a practical, theoretically sound, and highly parallelizable (for example, with a GPU) method for solving nonlinear discrete-time optimal control problems. An open-source JAX implementation of this algorithm can be found on GitHub (joaospinto/primal_dual_ilqr).
Local convergence of min-max algorithms to differentiable equilibrium on Riemannian manifold
We study min-max algorithms to solve zero-sum differentiable games on Riemannian manifold. The notions of differentiable Stackelberg equilibrium and differentiable Nash equilibrium in Euclidean space are generalized to Riemannian manifold, through an intrinsic definition which does not depend on the choice of local coordinate chart of manifold. We then provide sufficient conditions for the local convergence of the deterministic simultaneous algorithms $\tau$-GDA and $\tau$-SGA near such equilibrium, using a general methodology based on spectral analysis. These algorithms are extended with stochastic gradients and applied to the training of Wasserstein GAN. The discriminator of GAN is constructed from Lipschitz-continuous functions based on Stiefel manifold. We show numerically how the insights obtained from the local convergence analysis may lead to an improvement of GAN models.
A Momentum Accelerated Adaptive Cubic Regularization Method for Nonconvex Optimization
The cubic regularization method (CR) and its adaptive version (ARC) are popular Newton-type methods in solving unconstrained non-convex optimization problems, due to its global convergence to local minima under mild conditions. The main aim of this paper is to develop a momentum-accelerated adaptive cubic regularization method (ARCm) to improve the convergent performance. With the proper choice of momentum step size, we show the global convergence of ARCm and the local convergence can also be guaranteed under the \KL property. Such global and local convergence can also be established when inexact solvers with low computational costs are employed in the iteration procedure. Numerical results for non-convex logistic regression and robust linear regression models are reported to demonstrate that the proposed ARCm significantly outperforms state-of-the-art cubic regularization methods (e.g., CR, momentum-based CR, ARC) and the trust region method. In particular, the number of iterations required by ARCm is less than 10\% to 50\% required by the most competitive method (ARC) in the experiments.
On Convergence of Gradient Descent Ascent: A Tight Local Analysis
Li, Haochuan, Farnia, Farzan, Das, Subhro, Jadbabaie, Ali
Gradient Descent Ascent (GDA) methods are the mainstream algorithms for minimax optimization in generative adversarial networks (GANs). Convergence properties of GDA have drawn significant interest in the recent literature. Specifically, for $\min_{\mathbf{x}} \max_{\mathbf{y}} f(\mathbf{x};\mathbf{y})$ where $f$ is strongly-concave in $\mathbf{y}$ and possibly nonconvex in $\mathbf{x}$, (Lin et al., 2020) proved the convergence of GDA with a stepsize ratio $\eta_{\mathbf{y}}/\eta_{\mathbf{x}}=\Theta(\kappa^2)$ where $\eta_{\mathbf{x}}$ and $\eta_{\mathbf{y}}$ are the stepsizes for $\mathbf{x}$ and $\mathbf{y}$ and $\kappa$ is the condition number for $\mathbf{y}$. While this stepsize ratio suggests a slow training of the min player, practical GAN algorithms typically adopt similar stepsizes for both variables, indicating a wide gap between theoretical and empirical results. In this paper, we aim to bridge this gap by analyzing the \emph{local convergence} of general \emph{nonconvex-nonconcave} minimax problems. We demonstrate that a stepsize ratio of $\Theta(\kappa)$ is necessary and sufficient for local convergence of GDA to a Stackelberg Equilibrium, where $\kappa$ is the local condition number for $\mathbf{y}$. We prove a nearly tight convergence rate with a matching lower bound. We further extend the convergence guarantees to stochastic GDA and extra-gradient methods (EG). Finally, we conduct several numerical experiments to support our theoretical findings.